Abstract. It is more or less agreed that the Fréchet mean is the right
definition of an average in Hadamard spaces. On the other hand, it is not so
obvious how to compute this mean, and to the best of our knowledge, no
algorithm for computing the Fréchet mean in Hadamard spaces has been known so far.
The main purpose of the present paper is to introduce such an algorithm.

In applications, computing Fréchet means is probably most needed in the
tree space, which is an instance of an Hadamard space, introduced by Billera,
Holmes, and Vogtmann (2001) as a tool for averaging phylogenetic trees. It
turns out, however, that it can also be used to model numerous other tree-like
structures. Since there now exists a polynomial-time algorithm for computing
geodesics in the tree space due to Owen and Provan (2011), we obtain an
efficient algorithm for computing Fréchet means, which can be directly used
in practice.



1. Introduction 

Many objects in nature have a structure of a (combinatorial) tree, for instance 
bronchial tubes in lungs, veins, transport systems in plants. Furthermore, trees 
have also been indispensable in phylogenetic evolutionary models. 

For various reasons, biologists need to compare such trees, measure the difference 
between a given pair of trees, and also want to find an average tree of a given 
set of trees. In order to do so, one needs a space whose elements are trees. It 
should be equipped with a metric (in order to measure distances), and should 
also admit a robust definition of an average. Such a space was constructed by 
L. Billera, S. Holmes, and K. Vogtmann in [2] and named the tree space. In the 
same paper, the authors proved that the tree space is a metric space of nonpositive
curvature, that is, an Hadamard space. This property turns out to be important
from both the theoretical and the computational point of view. Moreover, one can
expect that exploring the rich geometrical structure of Hadamard spaces will yield 
further applications in this area. 
In an Hadamard space $(\mathcal{H}, d)$, an average of a set of points $x_1, \dots, x_N \in \mathcal{H}$ can
be defined as the Fréchet mean, that is, by

\[
\Xi(x_1, \dots, x_N) = \operatorname*{arg\,min}_{y \in \mathcal{H}} \sum_{n=1}^{N} d(y, x_n)^2 . \tag{1}
\]

The correctness of this definition is guaranteed by Theorem 2.4 below. Furthermore,
the Fréchet mean behaves nicely when we perturb the points $x_1, \dots, x_N$; more
precisely, it is Lipschitz continuous in its variables; see Theorem 2.4. Note that the
property which ensures that the Fréchet mean is well defined is the nonpositive
curvature of an Hadamard space.
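As a sanity check on definition (1), one can look at the simplest Hadamard space, the Euclidean plane, where the minimizer of the sum of squared distances is known to be the arithmetic mean. The following minimal Python sketch compares the arithmetic mean against randomly perturbed candidates. The function names `frechet_function` and `arithmetic_mean` are illustrative and do not come from the paper.

```python
import random

def frechet_function(y, points):
    """Sum of squared Euclidean distances from y to the given points."""
    return sum(sum((yi - pi) ** 2 for yi, pi in zip(y, p)) for p in points)

def arithmetic_mean(points):
    """Coordinate-wise average; in Euclidean space it minimizes frechet_function."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(5)]
mean = arithmetic_mean(points)
best = frechet_function(mean, points)

# Compare against 1000 randomly perturbed candidates around the mean.
candidates = [(mean[0] + rng.uniform(-1, 1), mean[1] + rng.uniform(-1, 1))
              for _ in range(1000)]
mean_is_minimizer = all(frechet_function(q, points) >= best for q in candidates)
```

In a general Hadamard space no such closed formula is available, which is why an iterative algorithm is needed.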

Since averages are of immense importance in many applications, we need an
algorithm for computing the Fréchet mean. The main purpose of the present paper
is to introduce such an algorithm. Research on this matter has already started,
see for instance [5]. Three different methods (centroid, Birkhoff's shortening,
and weighted averages) for computing the Fréchet mean were introduced in [5,
Section 4]. Unfortunately, none of these methods gives a correct result, as will be
demonstrated in Remark 4.1 below.

Our approach to computing the Fréchet mean is based on the law of large numbers,
which was proved in the context of Hadamard spaces by K.-T. Sturm [15]. It
turns out that Sturm's result can be rather directly used as an algorithm, and its
convergence to a correct value is assured; see Theorem 2.5. The resulting algorithm
requires finding a geodesic segment at each of its iterations. While this can be a
very difficult task in a general Hadamard space, geodesics in the tree space are
now quite well understood. Indeed, they were described already in [5], and in more
detail in [TP. From the practical point of view, it is important that we can find
a geodesic in the tree space in polynomial time. This is possible due to a recent
ingenious algorithm by M. Owen and J.S. Provan IJJJ.

Potential applications of our novel algorithm might be found in various areas of
computational biology, as alluded to at the beginning of this Introduction. We refer
the interested reader to [H IHl HI] and the references therein.

We have already implemented our algorithm and carried out first test computations.
A large computational study based on real data is now in preparation, and
will appear in a separate paper.

Let us now describe the rest of the present paper. In Section 2 we recall basic
facts on Hadamard spaces including the definition of the Fréchet mean, and the law
of large numbers. Section 3 is devoted to trees and the tree space. The algorithm for
computing the Fréchet mean in Hadamard spaces is introduced in Section 4, where
we also give a detailed description of the Owen-Provan algorithm for computing
geodesics in the tree space.

Acknowledgments. I am grateful to Martin Kell for helpful discussions about 
the law of large numbers. 



2. Hadamard spaces 

Since the notion of the Fréchet mean as well as the law of large numbers heavily
relies on the geometry of Hadamard spaces, we start by recalling rudiments of the
theory of these spaces with a special regard to the aforementioned results. As a
reference, we recommend [5] or [5].



COMPUTING THE FRECHET MEAN 






2.1. Definition of an Hadamard space. A metric space $(X, d)$ is called geodesic
if for any pair of points $x, y \in X$ there exists a geodesic which connects them. That
is, there exists a mapping $\gamma : [0, 1] \to X$ such that $\gamma(0) = x$, $\gamma(1) = y$, and

\[
d(\gamma(s), \gamma(t)) = d(x, y)\, |s - t| ,
\]

for $s, t \in [0, 1]$. A geodesic metric space $(X, d)$ has nonpositive curvature if for any
$z \in X$ and any geodesic $\gamma : [0, 1] \to X$ we have

\[
d(z, \gamma(t))^2 \le (1 - t)\, d(z, \gamma(0))^2 + t\, d(z, \gamma(1))^2 - t (1 - t)\, d(\gamma(0), \gamma(1))^2 , \tag{2}
\]

whenever $t \in [0, 1]$. This inequality says that geodesic triangles are thinner than
the corresponding triangles in the Euclidean plane with the same side lengths. One
can also show that (2) implies that each pair of points is connected by a unique
geodesic.

Given a pair of points $x, y \in X$, we denote $(1 - t)x + ty = \gamma(t)$, where $\gamma$ is the
geodesic connecting $x$ and $y$, and $t \in [0, 1]$.

Definition 2.1. A complete geodesic metric space of nonpositive curvature is called 
an Hadamard space. 
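To get a feeling for the nonpositive curvature inequality (2), it helps to evaluate it in the Euclidean plane, where geodesics are straight segments and (2) holds with equality. The following Python sketch computes the gap between the two sides for random triples of points. The helper names are illustrative assumptions, not part of the paper.

```python
import math
import random

def dist(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def geodesic_point(x, y, t):
    """The point (1 - t) x + t y on the straight segment from x to y."""
    return ((1 - t) * x[0] + t * y[0], (1 - t) * x[1] + t * y[1])

def curvature_gap(z, x, y, t):
    """Right-hand side minus left-hand side of inequality (2)."""
    g = geodesic_point(x, y, t)
    rhs = ((1 - t) * dist(z, x) ** 2 + t * dist(z, y) ** 2
           - t * (1 - t) * dist(x, y) ** 2)
    return rhs - dist(z, g) ** 2

rng = random.Random(1)
gaps = []
for _ in range(200):
    z, x, y = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(3)]
    gaps.append(curvature_gap(z, x, y, rng.random()))

# In the Euclidean plane the gap vanishes up to rounding error.
max_abs_gap = max(abs(g) for g in gaps)
```

In a genuinely nonpositively curved space, such as a tree, the gap is nonnegative and typically strictly positive.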

We will now recall an inequality which goes back to the work of Reshetnyak. Its 
modern proof can be found in [TBI Proposition 2.4], or in (TUl Lemma 2.1]. 

Lemma 2.2. Let $(\mathcal{H}, d)$ be an Hadamard space. Then we have

\[
d(x, y)^2 + d(u, v)^2 \le d(x, v)^2 + d(y, u)^2 + 2 d(x, u) d(y, v) ,
\]

for any points $x, y, u, v \in \mathcal{H}$.
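The inequality of Lemma 2.2, read here as $d(x,y)^2 + d(u,v)^2 \le d(x,v)^2 + d(y,u)^2 + 2 d(x,u) d(y,v)$, can be checked numerically in the Euclidean plane, where it reduces to the Cauchy-Schwarz inequality. The sketch below is only an illustration under that reading; the function names are hypothetical.

```python
import math
import random

def dist(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def quadruple_gap(x, y, u, v):
    """Right-hand side minus left-hand side of the quadruple inequality."""
    lhs = dist(x, y) ** 2 + dist(u, v) ** 2
    rhs = dist(x, v) ** 2 + dist(y, u) ** 2 + 2 * dist(x, u) * dist(y, v)
    return rhs - lhs

rng = random.Random(3)
quadruples = [[(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(4)]
              for _ in range(500)]
# The gap should never be negative (up to rounding error).
min_gap = min(quadruple_gap(*q) for q in quadruples)
```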

Here we collect several examples of Hadamard spaces. 

Example 2.3. The class of Hadamard spaces encompasses many diverse spaces
including

(i) Hilbert spaces,

(ii) hyperbolic spaces,

(iii) complete simply connected Riemannian manifolds of nonpositive sectional
curvature,

(iv) $\mathbb{R}$-trees, and

(v) Euclidean buildings.

Another important instance of an Hadamard space is the tree space; see Section 3.

It turns out that Hadamard spaces admit a natural generalization of the notion
of convexity. Indeed, let $(\mathcal{H}, d)$ be an Hadamard space. We say that a set $C \subset \mathcal{H}$ is
convex provided $x, y \in C$ implies $(1 - t)x + ty \in C$ for any $t \in [0, 1]$. Furthermore,
we say that a function $f : \mathcal{H} \to \mathbb{R}$ is convex if the function $f \circ \gamma : [0, 1] \to \mathbb{R}$ is
convex for any geodesic $\gamma : [0, 1] \to \mathcal{H}$.

2.2. The Fréchet mean in Hadamard spaces. Recall that the Fréchet mean of
a finite collection of points $x_1, \dots, x_N \in \mathcal{H}$ was defined in (1) as

\[
\Xi = \Xi(x_1, \dots, x_N) = \operatorname*{arg\,min}_{y \in \mathcal{H}} \sum_{n=1}^{N} d(y, x_n)^2 .
\]

Some authors alternatively use the name Karcher mean. The existence and uniqueness
of the minimizer in the definition is a consequence of nonpositive curvature. It
is guaranteed by the following theorem, which comes from ^ Theorem 3.2.1], [161
Proposition 4.4], and [TOl Lemma 4.2].

Theorem 2.4. Let $(\mathcal{H}, d)$ be an Hadamard space, and $x_1, \dots, x_N \in \mathcal{H}$ be a finite
set of points. Then there exists a unique point $\Xi$ defined in (1). Furthermore, this
$\Xi$ satisfies the variance inequality

\[
d(z, \Xi)^2 + \frac{1}{N} \sum_{n=1}^{N} d(\Xi, x_n)^2 \le \frac{1}{N} \sum_{n=1}^{N} d(z, x_n)^2 , \tag{3}
\]

for any $z \in \mathcal{H}$. Finally, the function $\Xi = \Xi(\cdot)$ satisfies

\[
d\left(\Xi(x_1, \dots, x_N), \Xi(x_1', \dots, x_N')\right) \le \frac{1}{N} \sum_{n=1}^{N} d(x_n, x_n') ,
\]

for any $x_1, \dots, x_N \in \mathcal{H}$, and $x_1', \dots, x_N' \in \mathcal{H}$.

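Proof. We are to show that there exists a unique minimizer of the function

\[
\varphi : y \mapsto \sum_{n=1}^{N} d(y, x_n)^2 , \qquad y \in \mathcal{H} .
\]

The function $\varphi$ is bounded from below by 0. Take a minimizing sequence $(y_k) \subset \mathcal{H}$,
that is, a sequence such that $\varphi(y_k) \to \inf \varphi$. The inequality (2) yields that $(y_k)$ is
Cauchy. Indeed, if $y_{kl}$ denotes the midpoint of $y_k$ and $y_l$, then (2) gives

\[
d(y_{kl}, x_n)^2 \le \frac{1}{2} d(y_k, x_n)^2 + \frac{1}{2} d(y_l, x_n)^2 - \frac{1}{4} d(y_k, y_l)^2 .
\]

Summing from $n = 1$ to $N$ and using $\varphi(y_{kl}) \ge \inf \varphi$ gives
$\frac{N}{4} d(y_k, y_l)^2 \le \frac{1}{2} \varphi(y_k) + \frac{1}{2} \varphi(y_l) - \inf \varphi$, and the right-hand side
tends to 0 as $k, l \to \infty$, so $(y_k)$ is Cauchy. Since $\mathcal{H}$ is complete and $\varphi$ is continuous,
the sequence $(y_k)$ converges to a minimizer of $\varphi$. The uniqueness of this minimizer
follows again from (2). It remains to show (3). Employing (2) yields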



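\[
\sum_{n=1}^{N} d(\gamma(t), x_n)^2 \le (1 - t) \sum_{n=1}^{N} d(\gamma(0), x_n)^2 + t \sum_{n=1}^{N} d(\gamma(1), x_n)^2 - N t (1 - t)\, d(\gamma(0), \gamma(1))^2
\]

for any geodesic $\gamma : [0, 1] \to \mathcal{H}$. Setting $\gamma(0) = \Xi$ and $\gamma(1) = z$ gives

\[
0 \le \sum_{n=1}^{N} d(\gamma(t), x_n)^2 - \sum_{n=1}^{N} d(\Xi, x_n)^2
\le t \left[ \sum_{n=1}^{N} d(z, x_n)^2 - \sum_{n=1}^{N} d(\Xi, x_n)^2 \right] - N t (1 - t)\, d(\Xi, z)^2
\]

for any $t \in (0, 1)$. Dividing by $t$ and letting $t \to 0$ yields (3).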

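If we denote $\Xi = \Xi(x_1, \dots, x_N)$, and $\Xi' = \Xi(x_1', \dots, x_N')$, then Lemma 2.2
yields

\[
d(x_n, \Xi')^2 + d(x_n', \Xi)^2 \le d(x_n, \Xi)^2 + d(x_n', \Xi')^2 + 2 d(\Xi, \Xi')\, d(x_n, x_n') ,
\]

multiplying by $1/N$ and summing over $n$ from 1 to $N$ further gives

\[
\frac{1}{N} \sum_{n=1}^{N} d(x_n, \Xi')^2 + \frac{1}{N} \sum_{n=1}^{N} d(x_n', \Xi)^2
\le \frac{1}{N} \sum_{n=1}^{N} d(x_n, \Xi)^2 + \frac{1}{N} \sum_{n=1}^{N} d(x_n', \Xi')^2 + 2 d(\Xi, \Xi')\, \frac{1}{N} \sum_{n=1}^{N} d(x_n, x_n') .
\]

By the variance inequality (3) we have

\[
\frac{1}{N} \sum_{n=1}^{N} d(x_n, \Xi')^2 + \frac{1}{N} \sum_{n=1}^{N} d(x_n', \Xi)^2
\ge \frac{1}{N} \sum_{n=1}^{N} d(x_n, \Xi)^2 + \frac{1}{N} \sum_{n=1}^{N} d(x_n', \Xi')^2 + 2 d(\Xi, \Xi')^2 .
\]

Altogether we obtain

\[
d(\Xi, \Xi') \le \frac{1}{N} \sum_{n=1}^{N} d(x_n, x_n') ,
\]

which finishes the proof. □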

2.3. The law of large numbers. The Fréchet mean is related to the law of large
numbers, both in the classical linear setting and in Hadamard spaces [15]. For a
historical remark we refer the interested reader to [15, Remark 2.7a]. As we shall
see in the sequel (more precisely in Section 4), the probabilistic point of view enables
one to find an algorithm for computing the Fréchet mean. Let again $x_1, \dots, x_N \in \mathcal{H}$,
and denote the probability measure

\[
\pi = \frac{1}{N} \sum_{n=1}^{N} \delta_{x_n} , \tag{4}
\]

where $\delta_{x_n}$ stands for the Dirac measure at $x_n$. Assume that $Y$ is a random variable
with values in $\mathcal{H}$ distributed according to $\pi$. Then the variance inequality (3) can
be written as

\[
d(z, \Xi)^2 + \mathbb{E}\, d(\Xi, Y)^2 \le \mathbb{E}\, d(z, Y)^2 , \qquad z \in \mathcal{H} , \tag{5}
\]

where the expectation $\mathbb{E}$ is of course taken with respect to the distribution $\pi$.

Given a sequence $(Y_i)$ of random variables with values in $\mathcal{H}$, we define a
sequence $(S_i)$ of random variables putting $S_1 = Y_1$, and

\[
S_{i+1} = \frac{i}{i+1} S_i + \frac{1}{i+1} Y_{i+1} \tag{6}
\]

for $i \in \mathbb{N}$. The random variables $Y_i$, and hence also $S_i$, are defined on some
probability space $\Omega$, but this space of course plays no role here. The following theorem
states a nonlinear version of the law of large numbers. It appeared in a much more
general form in [15, Theorem 2.6].
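To see why the recursion (6), read as $S_{i+1} = \frac{i}{i+1} S_i + \frac{1}{i+1} Y_{i+1}$, deserves the name "mean", one can run it on the real line, where the geodesic combination is an ordinary convex combination. In that special case $S_i$ is exactly the arithmetic mean of $Y_1, \dots, Y_i$. The sketch below verifies this with exact rational arithmetic; the function name `inductive_means` is illustrative.

```python
from fractions import Fraction

def inductive_means(samples):
    """Return S_1, S_2, ... built by the recursion (6) on the real line."""
    means = [samples[0]]  # S_1 = Y_1
    for i, y_next in enumerate(samples[1:], start=1):
        s_i = means[-1]
        # On the real line the geodesic combination is a convex combination.
        means.append(Fraction(i, i + 1) * s_i + Fraction(1, i + 1) * y_next)
    return means

samples = [Fraction(v) for v in (4, -1, 7, 0, 3, 3, -5)]
means = inductive_means(samples)

# Arithmetic means of the first k samples, for comparison.
running_averages = [sum(samples[:k]) / k for k in range(1, len(samples) + 1)]
```

In a nonlinear Hadamard space the two notions no longer coincide for finite $i$, and only the limit is guaranteed to be the Fréchet mean.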

Theorem 2.5 (The law of large numbers). Let $(\mathcal{H}, d)$ be an Hadamard space,
and $(Y_i)$ be a sequence of independent random variables $Y_i : \Omega \to \mathcal{H}$, identically
distributed according to the distribution $\pi$ defined in (4). Then

\[
S_i \to \Xi(x_1, \dots, x_N) , \quad \text{as } i \to \infty ,
\]

where the convergence is pointwise.
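Proof. First denote

\[
C = \frac{1}{N} \sum_{n=1}^{N} d(\Xi, x_n)^2 .
\]

We show by induction on $i \in \mathbb{N}$ that

\[
\mathbb{E}\, d(\Xi, S_i)^2 \le \frac{C}{i} . \tag{7}
\]

It obviously holds for $i = 1$ and we assume it holds for some $i \in \mathbb{N}$. We have

\[
\mathbb{E}\, d(\Xi, S_{i+1})^2 = \mathbb{E}\, d\left(\Xi, \frac{i}{i+1} S_i + \frac{1}{i+1} Y_{i+1}\right)^2 ,
\]

by (2) we get

\[
\mathbb{E}\, d(\Xi, S_{i+1})^2 \le \frac{i}{i+1} \mathbb{E}\, d(\Xi, S_i)^2 + \frac{1}{i+1} \mathbb{E}\, d(\Xi, Y_{i+1})^2 - \frac{i}{(i+1)^2} \mathbb{E}\, d(S_i, Y_{i+1})^2 ,
\]

and applying independence and (5) gives

\begin{align*}
\mathbb{E}\, d(\Xi, S_{i+1})^2
&\le \frac{i}{i+1} \mathbb{E}\, d(\Xi, S_i)^2 + \frac{1}{i+1} \mathbb{E}\, d(\Xi, Y_{i+1})^2
 - \frac{i}{(i+1)^2} \mathbb{E} \left[ d(\Xi, S_i)^2 + d(\Xi, Y_{i+1})^2 \right] \\
&= \left( \frac{i}{i+1} \right)^2 \mathbb{E}\, d(\Xi, S_i)^2 + \frac{1}{(i+1)^2} C \\
&\le \frac{C}{i+1} .
\end{align*}

This shows that (7) holds, and hence the proof is complete. □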
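The last step of the induction is pure arithmetic: assuming the bound $\mathbb{E}\, d(\Xi, S_i)^2 \le C/i$ and $\mathbb{E}\, d(\Xi, Y_{i+1})^2 = C$, the quantity $\left(\frac{i}{i+1}\right)^2 \frac{C}{i} + \frac{C}{(i+1)^2}$ must equal $\frac{C}{i+1}$. The short sketch below confirms this identity with exact rationals for a range of $i$. The value chosen for `C` is an arbitrary placeholder.

```python
from fractions import Fraction

def induction_step_bound(i, C):
    """Bound for step i + 1 obtained by inserting the bound C / i for step i."""
    i = Fraction(i)
    return (i / (i + 1)) ** 2 * (C / i) + C / (i + 1) ** 2

C = Fraction(7, 3)  # arbitrary positive number standing in for the variance
step_closes = all(induction_step_bound(i, C) == C / (i + 1) for i in range(1, 200))
```

The bound $C/i$ also indicates the expected rate of convergence: the mean squared error decays like $1/i$, as in the classical law of large numbers.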



3. Trees and the tree space 



We will now describe the construction of the tree space due to L. Billera, S. Holmes,
and K. Vogtmann. For the details, the interested reader is referred to the
original paper [2]. We first need to make precise what we mean by a tree. Given
$n \in \mathbb{N}$, a metric $n$-tree is a combinatorial tree (connected graph with no circuit)
with $n + 1$ terminal vertices called leaves that are labeled $0, 1, \dots, n$. The leaf with
the label 0 is called a root, but it will have no distinguished role in our considerations.
Vertices other than leaves have no labels since we consider them just as
'branching points'. The edges which are adjacent to a leaf are called leaf edges,
and the remaining edges are called inner. We see an example of a 6-tree with three
inner edges $e_1$, $e_2$, and $e_3$ in Figure 1. All edges, both leaf and inner, have positive
lengths. Instead of a metric $n$-tree, we will write simply a tree. The number $n$ will
be fixed and clear from the context. Later, when we consider multiple trees, it will
be important that they all have the same number of leaves.

Figure 1. An example of a 6-tree with three inner edges.

Each inner edge of a tree determines a unique partition of the set of leaves $L$
into two disjoint and nonempty subsets $L_1 \cup L_2 = L$ called a split, which we denote
$L_1|L_2$. A split is defined as the disjoint union of the two sets of leaves which arise
when we remove the inner edge under consideration. For instance, the inner edges
$e_1$, $e_2$, and $e_3$ of the tree in Figure 1 have splits $(0, 4, 5, 6 | 1, 2, 3)$, $(0, 1, 2, 3 | 4, 5, 6)$,
and $(0, 1, 2, 3, 4 | 5, 6)$, respectively. On the other hand, having a set of leaves and
splits subject to certain conditions, we can uniquely construct a tree [HHl]. Namely,
we require any two splits $L_1|L_2$ and $L_1'|L_2'$ be compatible, that is, one of the sets

\[
L_1 \cap L_2', \quad L_1' \cap L_2, \quad L_1 \cap L_1', \quad L_2 \cap L_2'
\]

be empty. We say that a set of inner edges $A$ is compatible if for any two edges
$e, e' \in A$, the corresponding splits are compatible. These terms will be essential for
computing geodesics in the tree space in Section 4.
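The compatibility condition is easy to state in code: two splits are compatible when at least one of the four pairwise intersections of their sides is empty. The following sketch represents a split as a pair of frozensets and checks the three splits listed for Figure 1. The split named `crossing` is a made-up example that does not come from the paper.

```python
def compatible(split_a, split_b):
    """True if one of the four intersections of sides of the two splits is empty."""
    (a1, a2), (b1, b2) = split_a, split_b
    return any(not (p & q) for p in (a1, a2) for q in (b1, b2))

# Splits of the inner edges e1, e2, e3 of the 6-tree described for Figure 1.
e1 = (frozenset({0, 4, 5, 6}), frozenset({1, 2, 3}))
e2 = (frozenset({0, 1, 2, 3}), frozenset({4, 5, 6}))
e3 = (frozenset({0, 1, 2, 3, 4}), frozenset({5, 6}))
splits = [e1, e2, e3]

figure1_compatible = all(
    compatible(s, t) for s in splits for t in splits if s is not t
)

# A hypothetical split that crosses e1, so it cannot belong to the same tree.
crossing = (frozenset({0, 1, 4}), frozenset({2, 3, 5, 6}))
crossing_is_compatible_with_e1 = compatible(crossing, e1)
```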

We will now proceed to construct a space of trees, that is, a space whose elements
will be all metric $n$-trees, where $n \in \mathbb{N}$ is a fixed number determined by the number
of leaf vertices. The resulting space will be the tree space $\mathcal{T}_n$. First, it is useful to
realize that one can treat leaf edges and inner edges separately. Since the former
can be represented in the Euclidean space of dimension $n + 1$, the whole space $\mathcal{T}_n$
is a product of a Euclidean factor and a factor representing the inner edges. We
may for simplicity ignore the Euclidean factor in the following construction.

Fix now a metric $n$-tree $T$ with $r$ inner edges of lengths $l_1, \dots, l_r$, where $1 \le
r \le n - 2$. Clearly $(l_1, \dots, l_r)$ lies in the open orthant $(0, \infty)^r$, and conversely, any
point of $(0, \infty)^r$ corresponds to an $n$-tree of the same combinatorial structure as $T$.
Note that a tree $S$ is said to have the same combinatorial structure as $T$ if it has
the same number of inner edges as $T$ and all its inner edges have the same splits
as the inner edges of $T$. In other words, the trees $S$ and $T$ differ only by inner edge
lengths.

To any point from the boundary $\partial (0, \infty)^r$ we associate a metric $n$-tree obtained
from $T$ by shrinking some inner edges to zero length. Hence, each point from the
closed orthant $[0, \infty)^r$ corresponds to a metric $n$-tree of the same combinatorial
structure as $T$.

Binary $n$-trees have the maximal possible number of inner edges, namely $n - 2$,
which is of course equal to the dimension of the corresponding orthant. An orthant
of an $n$-tree that is not binary appears as a face of the orthants corresponding to
(at least three) binary trees. In Figure 2, we see a copy of $[0, \infty)^2$ representing all
4-trees of a given combinatorial structure, namely, all 4-trees with two inner edges
$e_1$ and $e_2$, such that the split of $e_1$ is $(1, 2 | 0, 3, 4)$, and the split of $e_2$ is $(1, 2, 3 | 0, 4)$.
If the length of $e_1$ is zero, then the tree lies on the vertical boundary ray. If the
length of $e_2$ is zero, then the tree lies on the horizontal boundary ray. In summary,
any orthant $\mathcal{O} = [0, \infty)^r$, where $1 \le r \le n - 2$, corresponds to a compatible set
of inner edges, and conversely, any compatible set of inner edges $A = (e_1, \dots, e_r)$
corresponds to a unique orthant $\mathcal{O}(A)$, which is a copy of $[0, \infty)^r$.

Figure 2. 4-trees of a given combinatorial structure.

The tree space $\mathcal{T}_n$ consists of $(2n - 3)(2n - 5) \cdots 5 \cdot 3$ copies of orthants $[0, \infty)^{n-2}$
glued together along lower-dimensional faces, which correspond to non-binary trees,
that is, compatible sets of inner edges of cardinality less than $n - 2$.
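The number of top-dimensional orthants, $(2n - 3)(2n - 5) \cdots 5 \cdot 3$, grows very quickly with $n$, which is one reason why geodesics cannot be found by enumerating orthants. A small sketch of the count follows; the function name is illustrative.

```python
def number_of_maximal_orthants(n):
    """Product (2n - 3)(2n - 5) ... 5 * 3, the number of binary n-trees."""
    count = 1
    for k in range(3, 2 * n - 2, 2):  # odd factors 3, 5, ..., 2n - 3
        count *= k
    return count

orthant_counts = {n: number_of_maximal_orthants(n) for n in (3, 4, 5, 10)}
```

For $n = 3$ the count is 3, which matches the description of $\mathcal{T}_3$ as three rays glued at the origin.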

We equip the tree space $\mathcal{T}_n$ with the induced length metric. Then it becomes
a geodesic space. One can easily observe that each geodesic consists of finitely
many Euclidean line segments. The following important theorem states that the
tree space has nonpositive curvature.

Theorem 3.1. The space $\mathcal{T}_n$ is an Hadamard space.

Proof. The proof is an application of Gromov's characterization of CAT(0) complexes,
which requires the link of each vertex to be a flag complex. See [2, Lemma 4.1]. □



4. The algorithm for computing the Fréchet mean

This section contains the main result of the present paper: we shall introduce
a novel algorithm for computing Fréchet means in an Hadamard space $(\mathcal{H}, d)$.

4.1. Computing the Fréchet mean in Hadamard spaces. Recall that given
a finite family of points $x_1, \dots, x_N \in \mathcal{H}$, we have the existence and uniqueness
of its Fréchet mean $\Xi = \Xi(x_1, \dots, x_N)$ due to Theorem 2.4. We now get to the
question of how to compute this average. The methods used in [Ej, namely Birkhoff's
shortening, the centroid, and weighted averages, do not give a correct value, as one
can see in the following remark.

Remark 4.1. Let $(\mathcal{H}, d)$ be an Hadamard space consisting of three geodesic rays
issuing from the origin 0. This is an $\mathbb{R}$-tree, and as a matter of fact the tree space $\mathcal{T}_3$.
Consider three points $x, y, z \in \mathcal{H}$ lying in distinct rays issuing from the origin
such that $d(0, z) = 5$, and $d(0, x) = d(0, y) = 1$, as depicted in Figure 3. Then it
is easy to see that the Fréchet mean $\Xi = \Xi(x, y, z)$ lies on the geodesic $[0, z]$ and
$d(0, \Xi) = 1$. On the other hand, if we apply Birkhoff's shortening or the centroid
method, we will get a point $c \in [0, z]$ such that d(0, c) > |. Finally, the weighted
average of the points $x, y, z$ is a point $w \in [0, z]$ with d(0, w) = |.
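The claim $d(0, \Xi) = 1$ can be confirmed by brute force. On the ray through $z$, a point at distance $s$ from the origin has Fréchet function $2(1 + s)^2 + (5 - s)^2 = 3(s - 1)^2 + 24$, which is minimal at $s = 1$. The sketch below searches a grid on all three rays. It encodes a point as a pair (ray index, distance from the origin), which is an encoding chosen here for illustration only.

```python
def tripod_dist(p, q):
    """Distance in the tripod; a point is (ray index, distance from the origin)."""
    (ray_p, a), (ray_q, b) = p, q
    if ray_p == ray_q:
        return abs(a - b)
    return a + b  # the geodesic between different rays passes through the origin

def frechet_function(candidate, points):
    """Sum of squared tripod distances from candidate to the points."""
    return sum(tripod_dist(candidate, p) ** 2 for p in points)

# x and y at distance 1 on rays 0 and 1, z at distance 5 on ray 2.
points = [(0, 1.0), (1, 1.0), (2, 5.0)]

# Brute-force search over a grid with step 0.001 on each of the three rays.
grid = [(ray, k / 1000) for ray in range(3) for k in range(6001)]
best = min(grid, key=lambda p: frechet_function(p, points))
best_value = frechet_function(best, points)
```

The search returns the point at distance 1 on the ray of $z$, with Fréchet function value 24, compared with 27 at the origin.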




Figure 3. The Fréchet mean of three points.



It turns out, however, that one can use Theorem 2.5 to find an approximation
algorithm for computing the correct value of the Fréchet mean. Let us now describe
such an algorithm. It receives the points $x_1, \dots, x_N$ as the input, and at each
iteration $i \in \mathbb{N}$ produces a new point $S_i \in \mathcal{H}$, which is an approximate version of
the desired mean $\Xi = \Xi(x_1, \dots, x_N)$ in the sense that $d(S_i, \Xi) \to 0$ as $i \to \infty$. This is
guaranteed by Theorem 2.5. We now give a formal and precise description of the
algorithm.



Algorithm for computing the Fréchet mean

Input: $x_1, \dots, x_N$

Step 1: $S_1 := x_1$ and $i := 1$

Step 2: choose $r \in \{1, \dots, N\}$ at random

Step 3: $S_{i+1} := \frac{i}{i+1} S_i + \frac{1}{i+1} x_r$

Step 4: $i := i + 1$

Step 5: go to Step 2
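The five steps can be sketched in a few lines once a routine for geodesics is available. The Python sketch below runs the iteration on the three-ray space of Remark 4.1, where geodesics are elementary, and takes Step 3 to be $S_{i+1} := \frac{i}{i+1} S_i + \frac{1}{i+1} x_r$, as suggested by the recursion (6). The encoding of points as (ray index, distance from the origin), the function names, the fixed seed, and the number of iterations are all assumptions made for this illustration; this is not the author's implementation.

```python
import random

def tripod_dist(p, q):
    """Distance in the tripod; a point is (ray index, distance from the origin)."""
    (ray_p, a), (ray_q, b) = p, q
    return abs(a - b) if ray_p == ray_q else a + b

def tripod_geodesic(p, q, t):
    """Point (1 - t) p + t q on the geodesic from p to q, for t in [0, 1]."""
    (ray_p, a), (ray_q, b) = p, q
    if ray_p == ray_q:
        return (ray_p, (1 - t) * a + t * b)
    travelled = t * (a + b)            # distance moved from p towards q
    if travelled <= a:
        return (ray_p, a - travelled)  # still on the ray of p
    return (ray_q, travelled - a)      # past the origin, on the ray of q

def approximate_frechet_mean(points, geodesic, iterations, seed=0):
    """Run Steps 1 to 5 for a fixed number of iterations and return the last S_i."""
    rng = random.Random(seed)
    s = points[0]                                  # Step 1: S_1 := x_1
    for i in range(1, iterations):
        x_r = points[rng.randrange(len(points))]   # Step 2: random index r
        s = geodesic(s, x_r, 1 / (i + 1))          # Step 3
    return s

points = [(0, 1.0), (1, 1.0), (2, 5.0)]            # x, y, z of Remark 4.1
estimate = approximate_frechet_mean(points, tripod_geodesic, 200_000)
error = tripod_dist(estimate, (2, 1.0))            # distance to the true mean
```

Since the expected squared error decays like $1/i$, the output is a random approximation whose accuracy improves slowly; in practice one would add a stopping rule, which the description above leaves open.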



The main difficulty lies at Step 3. It requires computing the geodesic connecting
$x_r$ and $S_i$, and hence strongly depends on the nature of the Hadamard space in
question. We will in the sequel focus on the tree space, where we have an algorithm
for computing geodesics, and it works in polynomial time [12]. We should also like
to mention some earlier attempts to find an algorithm for computing geodesics in
the tree space, namely [Tll^ITT].

4.2. The Owen-Provan algorithm. In the remainder of this section we will describe
the algorithm for finding geodesics in a tree space $\mathcal{T}_n$ as presented in [T^ .
Let $T, T' \in \mathcal{T}_n$ be trees. As in the construction of the tree space in Section 3, we
observe that the leaf edges of any tree on the geodesic between $T$ and $T'$ lie in the
Euclidean factor of $\mathcal{T}_n$, and hence we can restrict our attention only to the inner
edges. Denote by $\mathcal{E}$ the set of inner edges of $T$, and by $\mathcal{E}'$ the set of inner edges of $T'$.
If the trees $T$ and $T'$ have a common inner edge, that is, if there exist $e \in \mathcal{E}$ and
$e' \in \mathcal{E}'$ which have the same splits, then this common edge will be present in any






MIROSLAV BACAK 



tree on the geodesic. This means that we can remove this edge and solve the original
problem for the two subtrees which arise by removing this common edge. Hence
we may assume without loss of generality that $T$ and $T'$ have no edge in common.

Let $\mathcal{A} = \{A_1, \dots, A_k\}$ and $\mathcal{B} = \{B_1, \dots, B_k\}$ be partitions of $\mathcal{E}$ and $\mathcal{E}'$,
respectively. If

\[
B_1 \cup \dots \cup B_l \cup A_{l+1} \cup \dots \cup A_k
\]

is a compatible set for each $l \in \{1, \dots, k\}$, then there exist corresponding orthants
$\mathcal{O}_l = \mathcal{O}(B_1 \cup \dots \cup B_l \cup A_{l+1} \cup \dots \cup A_k)$ in the tree space $\mathcal{T}_n$. The finite sequence
of such orthants $\mathcal{P} = (\mathcal{O}_1, \dots, \mathcal{O}_k)$ is called a path space for the pair $(T, T')$. The
pair $(\mathcal{A}, \mathcal{B})$ is called its support. The shortest curve through a path space which
connects $T$ and $T'$ is called a path space geodesic.

Theorem 4.2. Let $T, T' \in \mathcal{T}_n$ be trees with no edge in common. Then the geodesic
connecting $T$ and $T'$ is a path space geodesic for some path space between $T$ and $T'$.

We shall now proceed to identify path space geodesics. For a set of inner edges $A$,
denote

\[
\|A\| = \sqrt{\sum_{e \in A} |e|^2} ,
\]

where $|e|$ stands for the length of $e$.

Theorem 4.3. Let $T, T' \in \mathcal{T}_n$ be trees with no edge in common. Then a curve
$\gamma : [0, 1] \to \mathcal{T}_n$ such that $\gamma(0) = T$ and $\gamma(1) = T'$ is a geodesic if and only if there
exist partitions $\mathcal{A} = \{A_1, \dots, A_k\}$ and $\mathcal{B} = \{B_1, \dots, B_k\}$ of $\mathcal{E}$ and $\mathcal{E}'$, respectively,
satisfying the following conditions:

(i) For any $n > m$, the sets $A_n$ and $B_m$ are compatible,

(ii) The sets satisfy

\[
\frac{\|A_1\|}{\|B_1\|} \le \frac{\|A_2\|}{\|B_2\|} \le \dots \le \frac{\|A_k\|}{\|B_k\|} ,
\]

(iii) For each $n \in \{1, \dots, k\}$, there is no nontrivial partition $C_1 \cup C_2$ of $A_n$,
and partition $D_1 \cup D_2$ of $B_n$, such that $C_2$ is compatible with $D_1$ and

\[
\frac{\|C_1\|}{\|D_1\|} < \frac{\|C_2\|}{\|D_2\|} ,
\]

and $\gamma$ is a path space geodesic with support $(\mathcal{A}, \mathcal{B})$.



The algorithm for computing a geodesic is based on Theorem 4.3. We start
with the support $(\mathcal{A}^0, \mathcal{B}^0)$, where $A^0_1 = \mathcal{E}$ and $B^0_1 = \mathcal{E}'$, and with the path space
geodesic $\gamma^0$ which consists of the line segment connecting $T$ and 0, and the line
segment connecting 0 and $T'$. Having a path space geodesic $\gamma^i$ with support $(\mathcal{A}^i, \mathcal{B}^i)$
satisfying conditions (i) and (ii) of Theorem 4.3, we check whether the condition (iii)
of Theorem 4.3 is also satisfied. If so, we have a geodesic; otherwise we construct a
shorter path space geodesic $\gamma^{i+1}$ with support $(\mathcal{A}^{i+1}, \mathcal{B}^{i+1})$ satisfying conditions (i)
and (ii) of Theorem 4.3. By [H Theorem 3.5] we know that this algorithm gives a
geodesic in finitely many steps.

Let us now take a closer look at the iterative step $i \to i + 1$, and reformulate
it as the extension problem for bipartite graphs. Given sets $A \subset \mathcal{E}$ and $B \subset \mathcal{E}'$, we
define their incompatibility graph $G(A \cup B, E)$ as a bipartite graph with the vertex
set $A \cup B$, whose edges correspond to pairs $e \in A$ and $f \in B$ with incompatible
splits. Clearly, two sets $A \subset \mathcal{E}$ and $B \subset \mathcal{E}'$ are compatible if and only if they form an
independent set in $G(A \cup B, E)$. We define the weight of a vertex to be the squared
length of the corresponding tree edge. The extension problem for $G(A \cup B, E)$ asks
whether there exist a partition $C_1 \cup C_2$ of $A$ and a partition $D_1 \cup D_2$ of $B$ such that
$C_2 \cup D_1$ is an independent set in $G(A \cup B, E)$, and

\[
\frac{\|C_1\|}{\|D_1\|} < \frac{\|C_2\|}{\|D_2\|} . \tag{8}
\]

Hence a path space geodesic $\gamma^i$ with support $(\mathcal{A}^i, \mathcal{B}^i)$ is a geodesic if and only if
the extension problem has no solution for any pair $(A^i_j, B^i_j)$ of $(\mathcal{A}^i, \mathcal{B}^i)$.

To reformulate the extension problem, we first note that scaling the edge lengths
does not affect (8), hence we rescale the edge lengths so that $\|A\| = \|B\| = 1$.
Then (8) is equivalent to

\[
\|C_2\|^2 + \|D_1\|^2 > 1 .
\]

Since the maximal matching problem is according to König's theorem equivalent to
the minimal vertex cover problem, we need to find a vertex cover $C_1 \cup D_2$ of the
graph $G(A \cup B, E)$ such that

\[
\|C_1\|^2 + \|D_2\|^2 < 1 .
\]
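The equivalence between the ratio condition $\|C_1\| / \|D_1\| < \|C_2\| / \|D_2\|$ and the normalized condition $\|C_1\|^2 + \|D_2\|^2 < 1$ can be tested on random data. The sketch below assumes that $\|\cdot\|$ is the Euclidean norm of the edge lengths, works with squared lengths as exact rationals, and compares the two conditions. All names and the random test data are illustrative.

```python
from fractions import Fraction
import random

def sq_norm(squared_lengths):
    """Squared norm of an edge set, given the squared lengths of its edges."""
    return sum(squared_lengths, Fraction(0))

def ratio_inequality(c1, c2, d1, d2):
    """||C1|| / ||D1|| < ||C2|| / ||D2||, tested on squares to avoid square roots."""
    return sq_norm(c1) * sq_norm(d2) < sq_norm(c2) * sq_norm(d1)

def normalized_inequality(c1, c2, d1, d2):
    """||C1||^2 + ||D2||^2 < 1 after rescaling so that ||A|| = ||B|| = 1."""
    norm_a = sq_norm(c1) + sq_norm(c2)  # ||A||^2 with A = C1 u C2
    norm_b = sq_norm(d1) + sq_norm(d2)  # ||B||^2 with B = D1 u D2
    return sq_norm(c1) / norm_a + sq_norm(d2) / norm_b < 1

rng = random.Random(2)

def random_edge_set():
    """Between one and three edges with random positive squared lengths."""
    return [Fraction(rng.randint(1, 40), rng.randint(1, 40))
            for _ in range(rng.randint(1, 3))]

trials = [tuple(random_edge_set() for _ in range(4)) for _ in range(500)]
always_agree = all(ratio_inequality(*t) == normalized_inequality(*t) for t in trials)
```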

For a general graph, this is a typical NP-hard problem (its decision version is
NP-complete), but in the case of bipartite graphs, it is classically solved via a flow network
in polynomial time. As a reference on this subject, we recommend [13].

Indeed, we transform the bipartite graph $G(A \cup B, E)$, depicted in Figure 4,
into a flow network as follows. First we impose an orientation on all edges from $E$
so that they go from $A$ to $B$, and assign infinite capacity to each of these edges.
Then we add a new vertex called a source, and connect it with each vertex of $A$ so
that all these edges are oriented from the source to $A$. Set the capacity of an edge
going from the source to a vertex $a \in A$ to be the weight of $a$. In a similar way,
we add a new vertex called a sink, and connect all vertices of $B$ with it. These
edges are oriented from $B$ to the sink and have capacities equal to the weights of
the vertices of $B$. We obtain a flow network as in Figure 5. The aim now is to
push as much flow from the source to the sink as possible. A maximal flow
then gives a minimal cut by the max-flow min-cut theorem, and a minimal cut
is exactly the desired vertex cover, as one can easily observe. There exist many
algorithms which give a maximal flow in polynomial time. In our implementation
we choose the push-relabel algorithm due to A. Goldberg and R. Tarjan [7].

Figure 4. A bipartite graph with vertex weights.

Figure 5. The flow network with edge capacities.

If two trees have common edges, the decomposition into subtrees with no common
edge can be done in linear time. Hence the whole Owen-Provan algorithm for
finding a geodesic in the tree space works in polynomial time.
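The reduction from a minimum-weight vertex cover in a bipartite graph to a maximum flow can be sketched as follows. For brevity, this sketch uses shortest augmenting paths (in the style of Edmonds and Karp) rather than the push-relabel method mentioned above, so it illustrates the reduction and not the implementation used in the paper. The function name, the vertex labels, and the example weights are hypothetical.

```python
from collections import deque

def min_weight_vertex_cover(weights_a, weights_b, graph_edges):
    """Minimum-weight vertex cover of a bipartite graph via max flow.

    weights_a, weights_b: dicts mapping vertices of each side to positive weights.
    graph_edges: pairs (a, b) with a in weights_a and b in weights_b.
    Returns (cover_a, cover_b, total_weight).
    """
    source, sink = ("source",), ("sink",)
    cap = {source: {}, sink: {}}               # residual capacities
    for a, w in weights_a.items():
        cap[("A", a)] = {source: 0}
        cap[source][("A", a)] = w              # source -> a with capacity weight(a)
    for b, w in weights_b.items():
        cap[("B", b)] = {sink: w}              # b -> sink with capacity weight(b)
        cap[sink][("B", b)] = 0
    for a, b in graph_edges:
        cap[("A", a)][("B", b)] = float("inf") # a -> b with infinite capacity
        cap[("B", b)][("A", a)] = 0

    def search():
        """Breadth-first search in the residual network; returns parent pointers."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    parent = search()
    while sink in parent:
        path, v = [], sink
        while parent[v] is not None:           # walk back from sink to source
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path) # bottleneck capacity
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        flow += push
        parent = search()

    # Minimal cut: A-vertices not reachable from the source, B-vertices reachable.
    cover_a = {a for a in weights_a if ("A", a) not in parent}
    cover_b = {b for b in weights_b if ("B", b) in parent}
    return cover_a, cover_b, flow

cover_a, cover_b, weight = min_weight_vertex_cover(
    {"a1": 1, "a2": 5}, {"b1": 4, "b2": 2},
    [("a1", "b1"), ("a1", "b2"), ("a2", "b2")],
)
```

In the setting of the extension problem, the weights would be the squared edge lengths after normalization, the cover would play the role of $C_1 \cup D_2$, and a solution exists exactly when the returned weight is less than 1.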